When two trees go to war
Rooted phylogenetic networks are often constructed by combining trees,
clusters, triplets or characters into a single network that in some
well-defined sense simultaneously represents them all. We review these four
models and investigate how they are related. In general, the model chosen
influences the minimum number of reticulation events required. However, when
one obtains the input data from two binary trees, we show that the minimum
number of reticulations is independent of the model. The number of
reticulations necessary to represent the trees, triplets, clusters (in the
softwired sense) and characters (with unrestricted multiple crossover
recombination) are all equal. Furthermore, we show that these results also hold
when the level of the constructed network, rather than the number of
reticulations, is minimised. We use these unification results to settle several complexity
questions that have been open in the field for some time. We also give explicit
examples to show that already for data obtained from three binary trees the
models begin to diverge.
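To make two of the four data types concrete, the following is a minimal sketch of how clusters and rooted triplets can be read off a single rooted binary tree. The nested-tuple tree representation and the function names `leaves`, `clusters` and `triplets` are assumptions made for illustration; they are not taken from the paper, and the sketch does not build a network.

```python
from itertools import combinations

# Assumed representation: a leaf is a string, an internal node is a tuple
# of child subtrees, e.g. ((("a", "b"), "c"), "d").

def leaves(tree):
    """Return the set of leaf labels below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def clusters(tree, is_root=True):
    """Leaf sets of all internal nodes except the root (non-trivial clusters)."""
    if isinstance(tree, str):
        return set()
    result = set() if is_root else {leaves(tree)}
    for child in tree:
        result |= clusters(child, is_root=False)
    return result

def triplets(tree):
    """Rooted triplets ab|c, encoded as (frozenset({a, b}), c).

    At each internal node, any two leaves a, b from one child subtree and
    any leaf c from another child subtree give the triplet ab|c.
    """
    if isinstance(tree, str):
        return set()
    result = set()
    child_leaves = [leaves(child) for child in tree]
    for i, child in enumerate(tree):
        result |= triplets(child)
        for j, other in enumerate(child_leaves):
            if i == j:
                continue
            for a, b in combinations(sorted(child_leaves[i]), 2):
                for c in other:
                    result.add((frozenset((a, b)), c))
    return result

example_tree = ((("a", "b"), "c"), "d")
example_clusters = clusters(example_tree)
example_triplets = triplets(example_tree)
```

For a binary tree on n leaves this yields exactly one triplet for every three leaves, which is the input that the triplet-based model takes from each tree.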
Trinets encode tree-child and level-2 phylogenetic networks
Phylogenetic networks generalize evolutionary trees, and are commonly used to
represent evolutionary histories of species that undergo reticulate
evolutionary processes such as hybridization, recombination and lateral gene
transfer. Recently, there has been great interest in trying to develop methods
to construct rooted phylogenetic networks from triplets, that is, rooted trees
on three species. However, although triplets determine or encode rooted
phylogenetic trees, they do not in general encode rooted phylogenetic networks,
which is a potential issue for any such method. Motivated by this fact, Huber
and Moulton recently introduced trinets as a natural extension of rooted
triplets to networks. In particular, they showed that level-1 phylogenetic
networks are encoded by their trinets, and also conjectured that all
"recoverable" rooted phylogenetic networks are encoded by their trinets. Here
we prove that recoverable binary level-2 networks and binary tree-child
networks are also encoded by their trinets. To do this we prove two
decomposition theorems based on trinets which hold for all recoverable binary
rooted phylogenetic networks. Our results provide some additional evidence in
support of the conjecture that trinets encode all recoverable rooted
phylogenetic networks, and could also lead to new approaches to construct
phylogenetic networks from trinets.
A quadratic kernel for computing the hybridization number of multiple trees
It has recently been shown that the NP-hard problem of calculating the
minimum number of hybridization events that is needed to explain a set of
rooted binary phylogenetic trees by means of a hybridization network is
fixed-parameter tractable if an instance of the problem consists of precisely
two such trees. In this paper, we show that this problem remains
fixed-parameter tractable for an arbitrarily large set of rooted binary
phylogenetic trees. In particular, we present a quadratic kernel.
Uniqueness, intractability and exact algorithms: reflections on level-k phylogenetic networks
Phylogenetic networks provide a way to describe and visualize evolutionary
histories that have undergone so-called reticulate evolutionary events such as
recombination, hybridization or horizontal gene transfer. The level k of a
network determines how non-treelike the evolution can be, with level-0 networks
being trees. We study the problem of constructing level-k phylogenetic networks
from triplets, i.e. phylogenetic trees for three leaves (taxa). We give, for
each k, a level-k network that is uniquely defined by its triplets. We
demonstrate the applicability of this result by using it to prove that (1) for
all k of at least one it is NP-hard to construct a level-k network consistent
with all input triplets, and (2) for all k it is NP-hard to construct a level-k
network consistent with a maximum number of input triplets, even when the input
is dense. As a response to this intractability we give an exact algorithm for
constructing level-1 networks consistent with a maximum number of input
triplets.
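The notions of a triplet being consistent with a network, and of a dense triplet set, can be sketched for the simplest case of a level-0 network, that is, a tree. The code below is an illustration under assumptions, not the exact level-1 algorithm from the paper: the nested-tuple tree format, the encoding of a triplet ab|c as the tuple `(a, b, c)`, and all function names are hypothetical.

```python
from itertools import combinations

def leaf_paths(tree, path=()):
    """Map each leaf label to its path of child indices from the root.

    Assumed representation: a leaf is a string, an internal node is a tuple.
    """
    if isinstance(tree, str):
        return {tree: path}
    paths = {}
    for i, child in enumerate(tree):
        paths.update(leaf_paths(child, path + (i,)))
    return paths

def lca_depth(p, q):
    """Depth of the lowest common ancestor: length of the common path prefix."""
    depth = 0
    for x, y in zip(p, q):
        if x != y:
            break
        depth += 1
    return depth

def is_consistent(paths, triplet):
    """True if the tree displays ab|c: lca(a, b) lies strictly below lca(a, c)."""
    a, b, c = triplet
    return lca_depth(paths[a], paths[b]) > lca_depth(paths[a], paths[c])

def count_consistent(tree, triplets):
    """Number of input triplets (a, b, c), read as ab|c, that the tree displays."""
    paths = leaf_paths(tree)
    return sum(is_consistent(paths, t) for t in triplets)

def is_dense(taxa, triplets):
    """Dense: every three taxa have at least one input triplet on them."""
    covered = {frozenset(t) for t in triplets}
    return all(frozenset(s) in covered for s in combinations(taxa, 3))
```

The maximisation problem in (2) asks for a network that makes a count like `count_consistent` as large as possible; here the candidate is fixed and only the counting step is shown.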
Exact reconciliation of undated trees
Reconciliation methods aim at recovering macro evolutionary events and at
localizing them in the species history, by observing discrepancies between gene
family trees and species trees. In this article we introduce an Integer Linear
Programming (ILP) approach for the NP-hard problem of computing a most
parsimonious time-consistent reconciliation of a gene tree with a species tree
when dating information on speciations is not available. The ILP formulation,
which builds upon the DTL model, returns a most parsimonious reconciliation
ranging over all possible datings of the nodes of the species tree. By studying
its performance on plausible simulated data we conclude that the ILP approach
is significantly faster than a brute-force search through the space of all
possible species tree datings. Although the ILP formulation is currently
limited to small trees, we believe that it is an important proof-of-concept
which opens the door to the possibility of developing an exact, parsimony-based
approach to dating species trees. The software (ILPEACE) is freely available
for download.
Quantifying the Extent of Lateral Gene Transfer Required to Avert a 'Genome of Eden'
The complex pattern of presence and absence of many genes across different
species provides tantalising clues as to how genes evolved through the
processes of gene genesis, gene loss and lateral gene transfer (LGT). The
extent of LGT, particularly in prokaryotes, and its implications for creating a
'network of life' rather than a 'tree of life' is controversial. In this paper,
we formally model the problem of quantifying LGT, and provide exact
mathematical bounds, and new computational results. In particular, we
investigate the computational complexity of quantifying the extent of LGT under
the simple models of gene genesis, loss and transfer on which a recent
heuristic analysis of biological data relied. Our approach takes advantage of a
relationship between LGT optimization and graph-theoretical concepts such as
tree width and network flow.
Kernelizations for the hybridization number problem on multiple nonbinary trees
Given a finite set $X$, a collection $\mathcal{T}$ of rooted phylogenetic
trees on $X$ and an integer $k$, the Hybridization Number problem asks if there
exists a phylogenetic network on $X$ that displays all trees from $\mathcal{T}$
and has reticulation number at most $k$. We show two kernelization algorithms
for Hybridization Number, with kernel sizes $4k(5k)^t$ and $20k^2(\Delta^+ - 1)$
respectively, with $t$ the number of input trees and $\Delta^+$ their maximum
outdegree. Experiments on simulated data demonstrate the practical relevance of
these kernelization algorithms. In addition, we present an $n^{f(k)}t$-time
algorithm, with $n = |X|$ and $f$ some computable function of $k$
- …